University of Delaware Department of Electrical and Computer Engineering Computer Architecture and Parallel Systems Laboratory Collaborative Research: Programming Models and Storage System for High Performance Computation with Many-Core Processors

Authors

  • Jack B. Dennis
  • Guang R. Gao
  • Vivek Sarkar
Abstract

Future-generation HEC architectures and systems built using many-core chips will pose unprecedented challenges for users. A major source of these challenges is the storage system, since the available memory and bandwidth per processor core are starting to decline at an alarming rate as the number of cores per chip rapidly increases. Data-intensive applications that require large data sets and/or high input/output (I/O) bandwidth will be especially vulnerable to these trends. Unlike previous generations of hardware evolution, this shift in the hardware road-map will have a profound impact on HEC software. Historically, the storage architecture of an HEC system has been constrained to a large degree by the file system interfaces in the underlying Operating System (OS). The basic design of file systems has been largely unchanged for multiple decades, and will no longer suffice as the foundational storage model for many-core HEC systems. Instead, it becomes necessary to explore new approaches to storage systems that reduce OS overhead, exploit multiple levels of the storage hierarchy, and focus on new techniques for using bandwidth efficiently and reducing average access times across these levels. The specific focus of this proposal is on exploring a new storage model based on write-once tree structures, which is radically different from traditional flat files. The basic unit of data transfer is a chunk, which is intended to be much smaller than the data transfer sizes used in traditional storage systems. The write-once property simplifies the memory model for the storage system and obviates the need for complicated consistency protocols.
We will explore three programming models for users of the storage system, all of which can inter-operate through shared persistent data: 1) a declarative programming model in which any data structure can be directly made persistent in our storage system, with no programmer intervention, 2) a strongly-typed imperative programming model in which a type system extension will be used to enforce a separation between data structures that can be directly made persistent and those that cannot, and 3) a weakly-typed runtime interface that enables C programs to access our storage system. A compiler with a high-level data flow intermediate representation and a lower-level parallel intermediate representation will provide the necessary code generation support, and a runtime system will implement the interfaces to the storage system used by compiler-generated code and by the weakly-typed runtime interface. Our proposed research will be evaluated using an experimental testbed that …
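
The write-once tree model mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' design: the class `ChunkStore`, the functions `build`, `read`, and `updated`, and the chunk capacity `FANOUT = 4` are all hypothetical names and values chosen for illustration. It only shows the general idea that chunks are never modified after being written, so an "update" creates new chunks along one path while the old tree stays readable, and no reader ever needs a consistency protocol to see a stable version.

```python
from typing import List, Sequence, Tuple

FANOUT = 4  # assumed number of slots per chunk; not a value from the proposal


class ChunkStore:
    """Hypothetical append-only store: each chunk is written once, never changed."""

    def __init__(self) -> None:
        self._chunks: List[tuple] = []

    def put(self, slots: Sequence) -> int:
        if len(slots) > FANOUT:
            raise ValueError("chunk holds at most FANOUT slots")
        self._chunks.append(tuple(slots))  # stored as a tuple, so it is immutable
        return len(self._chunks) - 1       # the handle is the chunk's position

    def get(self, handle: int) -> tuple:
        return self._chunks[handle]


def build(store: ChunkStore, values: Sequence) -> Tuple[int, int]:
    """Pack values into leaf chunks, then stack interior chunks up to one root.

    Returns (root_handle, depth), where depth 0 means the root is a leaf.
    """
    if not values:
        raise ValueError("need at least one value")
    level = [store.put(values[i:i + FANOUT]) for i in range(0, len(values), FANOUT)]
    depth = 0
    while len(level) > 1:
        level = [store.put(level[i:i + FANOUT]) for i in range(0, len(level), FANOUT)]
        depth += 1
    return level[0], depth


def read(store: ChunkStore, root: int, depth: int, index: int):
    """Follow one handle per level from the root down to the leaf slot."""
    handle = root
    for d in range(depth, 0, -1):
        handle = store.get(handle)[(index // FANOUT ** d) % FANOUT]
    return store.get(handle)[index % FANOUT]


def updated(store: ChunkStore, root: int, depth: int, index: int, value) -> int:
    """Return the handle of a new root with one value replaced.

    Only the chunks on the path to `index` are rewritten as new chunks;
    the old root and everything reachable from it remain unchanged.
    """
    slots = list(store.get(root))
    if depth == 0:
        slots[index % FANOUT] = value
    else:
        slot = (index // FANOUT ** depth) % FANOUT
        slots[slot] = updated(store, slots[slot], depth - 1, index, value)
    return store.put(slots)
```

A caller would build a tree once, for example `root, depth = build(store, data)`, and treat each root handle as an immutable snapshot. Calling `updated` yields a second root that shares all untouched chunks with the first, which is one common way to get versioning from write-once storage.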

Similar resources

University of Delaware Department of Electrical and Computer Engineering Computer Architecture and Parallel Systems Laboratory Experiments with the Fresh Breeze Tree-Based Memory Model

Recent developments have brought to the forefront some pressing and difficult problems concerning the usability of computer systems: lack of a satisfactory general purpose programming model for parallel computation; how to achieve efficient utilization of processing and memory resources; and system resilience in the presence of malicious attacks and the expectation that future hardware will be ...

Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications

Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...

University of Delaware Department of Electrical and Computer Engineering Computer Architecture and Parallel Systems Laboratory Synchronization for Dynamic Task Parallelism on Manycore Architectures

Manycore architectures, with hundreds to thousands of cores per processor, are seen by many as a natural evolution of multicore processors. Taking advantage of this massive parallelism in practice requires a productive programming interface for parallel programming, and an efficient execution and thread coordination runtime. Dynamic task parallelism, introduced recently in several programming langu...

Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems

Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge because it is difficult to schedule hardware resources in the presence of thread concurrency. To resolve this problem, this paper proposes a novel method that parallelizes the GA by designing three concurrent kernels, each of which runs some depe...

Green Energy-aware task scheduling using the DVFS technique in Cloud Computing

Nowadays, energy consumption has become a critical issue in high-performance distributed computing systems, so green computing tries to reduce energy consumption, carbon footprint, and CO2 emissions in high performance computing systems (HPCs) such as clusters, Grid, and Cloud that run a large number of parallel applications. Reducing energy consumption for high end computing can bring various benefits such as red...


Publication date: 2009